This page provides two-click reproductions for a number of experimental runs on the Mr. TyDi dataset. Instructions for programmatic execution are shown at the bottom of this page. The dataset is described in the following paper:

Xinyu Zhang, Xueguang Ma, Peng Shi, and Jimmy Lin. Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval. Proceedings of the 1st Workshop on Multilingual Representation Learning, pages 127-137, November 2021, Punta Cana, Dominican Republic.
Key: "mDPR" denotes multilingual DPR (DPR with a multilingual BERT backbone); "split" and "tied" indicate whether the query and passage encoders use separate or shared weights; "pFT X" means the model was pre-fine-tuned on dataset X (NQ or MS MARCO); "FT all" means it was further fine-tuned on the union of all Mr. TyDi languages.
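Both tables report standard cutoff-100 metrics. MRR@100 is the mean reciprocal rank of the first relevant passage:

$$\mathrm{MRR@100} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\mathrm{rank}_q}$$

where $\mathrm{rank}_q$ is the rank of the first relevant passage for query $q$, with the term taken as 0 if no relevant passage appears in the top 100. Recall@100 is the fraction of a query's relevant passages retrieved in the top 100, averaged over all queries.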
| MRR@100, test queries | ar | bn | en | fi | id | ja | ko | ru | sw | te | th | avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BM25 | 0.368 | 0.418 | 0.140 | 0.284 | 0.376 | 0.212 | 0.285 | 0.316 | 0.389 | 0.528 | 0.401 | 0.338 |
| mDPR (split) pFT NQ | 0.291 | 0.296 | 0.291 | 0.205 | 0.271 | 0.212 | 0.234 | 0.282 | 0.188 | 0.110 | 0.171 | 0.232 |
| mDPR (tied) pFT NQ | 0.221 | 0.254 | 0.243 | 0.244 | 0.281 | 0.206 | 0.223 | 0.250 | 0.262 | 0.097 | 0.158 | 0.222 |
| mDPR (tied) pFT MS MARCO | 0.441 | 0.397 | 0.327 | 0.275 | 0.352 | 0.311 | 0.282 | 0.356 | 0.342 | 0.310 | 0.269 | 0.333 |
| mDPR (tied) pFT MS MARCO + FT all | 0.695 | 0.623 | 0.492 | 0.559 | 0.578 | 0.501 | 0.486 | 0.516 | 0.644 | 0.891 | 0.618 | 0.600 |

| Recall@100, test queries | ar | bn | en | fi | id | ja | ko | ru | sw | te | th | avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BM25 | 0.793 | 0.869 | 0.536 | 0.720 | 0.843 | 0.643 | 0.619 | 0.654 | 0.764 | 0.897 | 0.853 | 0.745 |
| mDPR (split) pFT NQ | 0.650 | 0.793 | 0.678 | 0.568 | 0.685 | 0.584 | 0.532 | 0.647 | 0.528 | 0.366 | 0.515 | 0.595 |
| mDPR (tied) pFT NQ | 0.600 | 0.707 | 0.689 | 0.640 | 0.691 | 0.573 | 0.550 | 0.618 | 0.597 | 0.245 | 0.455 | 0.579 |
| mDPR (tied) pFT MS MARCO | 0.797 | 0.784 | 0.754 | 0.647 | 0.736 | 0.732 | 0.617 | 0.743 | 0.634 | 0.782 | 0.595 | 0.711 |
| mDPR (tied) pFT MS MARCO + FT all | 0.900 | 0.955 | 0.841 | 0.856 | 0.861 | 0.813 | 0.785 | 0.843 | 0.876 | 0.966 | 0.883 | 0.871 |
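
Each cell in the tables above corresponds to one run-generation command plus evaluation commands. As a rough sketch of their shape for the BM25 condition in Korean (the index, topics, and qrels names here are illustrative assumptions; use --display-commands below to obtain the exact commands):

```bash
# Generate a BM25 run (index/topics names are assumed for illustration):
python -m pyserini.search.lucene \
  --language ko \
  --topics mrtydi-v1.1-ko-test \
  --index mrtydi-v1.1-korean \
  --output run.mrtydi.bm25.ko.test.txt \
  --bm25 --hits 100

# Evaluate MRR@100 and Recall@100 (qrels name is likewise assumed):
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank mrtydi-v1.1-ko-test run.mrtydi.bm25.ko.test.txt
python -m pyserini.eval.trec_eval -c -m recall.100 mrtydi-v1.1-ko-test run.mrtydi.bm25.ko.test.txt
```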
All experimental runs shown in the tables above can be programmatically executed based on the instructions below. To list all the experimental conditions:
python -m pyserini.2cr.mrtydi --list-conditions
Run all languages for a specific condition and show commands:
python -m pyserini.2cr.mrtydi --condition bm25 --display-commands
Run a particular language for a specific condition and show commands:
python -m pyserini.2cr.mrtydi --condition bm25 --language ko --display-commands
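To sweep every language under a single condition, the per-language invocation can be wrapped in a shell loop, using the language codes from the table headers:

```bash
# Run the bm25 condition for each Mr. TyDi language in turn.
for lang in ar bn en fi id ja ko ru sw te th; do
  python -m pyserini.2cr.mrtydi --condition bm25 --language ${lang}
done
```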
Run all languages for all conditions and show commands:
python -m pyserini.2cr.mrtydi --all --display-commands
With the above commands, run files will be placed in the current directory. Use the option --directory runs to place the runs in a sub-directory.
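For example, to run everything and collect all run files under runs/:

```bash
python -m pyserini.2cr.mrtydi --all --directory runs
```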
For a specific condition, just show the commands and do not run:
python -m pyserini.2cr.mrtydi --condition bm25 --display-commands --dry-run
This prints exactly the commands for the specified condition (corresponding to a row in the tables above) without running them.
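Since --dry-run only prints the commands, standard shell redirection can capture them into a script for inspection or later execution (a minimal sketch; any other console output from the tool would be captured as well):

```bash
python -m pyserini.2cr.mrtydi --condition bm25 --display-commands --dry-run > bm25-commands.sh
```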
For a specific condition and language, just show the commands and do not run:
python -m pyserini.2cr.mrtydi --condition bm25 --language ko --display-commands --dry-run
For all conditions, just show the commands without running them, and skip evaluation:
python -m pyserini.2cr.mrtydi --all --display-commands --dry-run --skip-eval
Finally, to generate this page:
python -m pyserini.2cr.mrtydi --generate-report --output docs/2cr/mrtydi.html
The output file mrtydi.html should be identical to this page.